11 research outputs found

    Processing Regular Path Queries on Arbitrarily Distributed Data

    Regular Path Queries (RPQs) are a type of graph query where answers are pairs of nodes connected by a sequence of edges matching a regular expression. We study techniques for processing such queries on a distributed graph of data. While many techniques assume the location of each data element (node or edge) is known, when the components of the distributed system are autonomous, the data will be arbitrarily distributed. As the different query processing strategies are equally costly in the worst case, we isolate query-dependent cost factors and present a method to choose between strategies, using new query cost estimation techniques. We evaluate our techniques using meaningful queries on biomedical data.
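To make the RPQ definition above concrete, here is a minimal single-machine sketch in Python. It is not the distributed technique the paper studies. It assumes the regular expression has already been compiled by hand into a small deterministic automaton, and the graph, the labels `encodes` and `interacts`, and the function name `eval_rpq` are all hypothetical illustrations.

```python
from collections import deque

# Hypothetical edge-labelled graph: (source, label, target) triples.
EDGES = [
    ("g1", "encodes", "p1"),
    ("p1", "interacts", "p2"),
    ("p2", "interacts", "p3"),
    ("g2", "encodes", "p3"),
]

# Hand-built automaton for the expression  encodes interacts*
# (state 0 is the start state, state 1 is accepting).
TRANSITIONS = {(0, "encodes"): 1, (1, "interacts"): 1}


def eval_rpq(edges, transitions, start_state, accepting):
    """Return all (source, target) node pairs linked by a path whose
    edge labels spell a word accepted by the automaton."""
    # Index outgoing edges by source node: node -> [(label, target), ...]
    out = {}
    nodes = set()
    for src, label, dst in edges:
        out.setdefault(src, []).append((label, dst))
        nodes.update((src, dst))

    answers = set()
    for source in nodes:
        # Breadth-first search over (graph node, automaton state) pairs.
        seen = {(source, start_state)}
        queue = deque(seen)
        while queue:
            node, state = queue.popleft()
            if state in accepting:
                answers.add((source, node))
            for label, nxt in out.get(node, []):
                nxt_state = transitions.get((state, label))
                if nxt_state is not None and (nxt, nxt_state) not in seen:
                    seen.add((nxt, nxt_state))
                    queue.append((nxt, nxt_state))
    return answers


if __name__ == "__main__":
    for pair in sorted(eval_rpq(EDGES, TRANSITIONS, 0, {1})):
        print(pair)
```

The search runs over pairs of graph node and automaton state, so each pair is visited at most once per source node, and the search terminates even on cyclic graphs.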

    Provabs: Model, policy, and tooling for abstracting PROV graphs

    Provenance metadata can be valuable in data sharing settings, where it can be used to help data consumers form judgements regarding the reliability of the data produced by third parties. However, some parts of provenance may be sensitive, requiring access control, or they may need to be simplified for the intended audience. Both these issues can be addressed by a single mechanism for creating abstractions over provenance, coupled with a policy model to drive the abstraction. Such a mechanism, which we refer to as abstraction by grouping, simultaneously achieves partial disclosure of provenance and facilitates its consumption. In this paper we introduce a formal foundation for this type of abstraction, grounded in the W3C PROV model; describe the associated policy model; and briefly present its implementation, the Provabs tool for interactive experimentation with policies and abstractions. Comment: In Procs. IPAW 2014 (Provenance and Annotations). Koln, Germany: Springer, 201

    Identifying equivalent relation paths in knowledge graphs

    Relation paths are sequences of relations with inverse that allow for complete exploration of knowledge graphs in a two-way unconstrained manner. They are powerful enough to encode complex relationships between entities and are crucial in several contexts, such as knowledge base verification, rule mining, and link prediction. However, fundamental forms of reasoning such as containment and equivalence of relation paths have hitherto been ignored. Intuitively, two relation paths are equivalent if they share the same extension, i.e., set of source and target entity pairs. In this paper, we study the problem of containment as a means to find equivalent relation paths and show that it is very expensive in practice to enumerate paths between entities. We characterize the complexity of containment and equivalence of relation paths and propose a domain-independent and unsupervised method to obtain approximate equivalences ranked by a tri-criteria ranking function. We evaluate our algorithm using test cases over real-world data and show that we are able to find semantically meaningful equivalences efficiently.
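The extension-based notion of equivalence described above can be illustrated with a brute-force Python sketch. This is not the approximate, ranked method the paper proposes. It simply computes each path's extension over one small graph and compares the sets. The triples, the `^` prefix for inverse steps, and the function names are assumptions made for illustration only.

```python
# Hypothetical knowledge graph as (subject, relation, object) triples.
TRIPLES = [
    ("alice", "childOf", "bob"),
    ("bob", "parentOf", "alice"),
    ("carol", "childOf", "dave"),
    ("dave", "parentOf", "carol"),
]


def extension(triples, path):
    """Set of (source, target) entity pairs connected by `path`.

    `path` is a list of relation names; a leading '^' marks an inverse
    step (a notation assumed here, not taken from the paper)."""
    # Index edges in both directions so inverse steps are cheap to follow.
    forward, backward = {}, {}
    entities = set()
    for s, r, o in triples:
        forward.setdefault((r, s), set()).add(o)
        backward.setdefault((r, o), set()).add(s)
        entities.update((s, o))

    pairs = set()
    for source in entities:
        frontier = {source}
        for step in path:
            if step.startswith("^"):
                index, rel = backward, step[1:]
            else:
                index, rel = forward, step
            # Follow one relation step from every entity reached so far.
            frontier = {t for e in frontier for t in index.get((rel, e), ())}
        pairs.update((source, target) for target in frontier)
    return pairs


def contained(triples, p, q):
    """True if every pair in the extension of p is also in that of q."""
    return extension(triples, p) <= extension(triples, q)


def equivalent(triples, p, q):
    """True if p and q have the same extension on this graph."""
    return extension(triples, p) == extension(triples, q)


if __name__ == "__main__":
    print(equivalent(TRIPLES, ["childOf"], ["^parentOf"]))
    print(equivalent(TRIPLES, ["childOf"], ["parentOf"]))
```

Because the sketch materializes every reachable pair for each path, its cost grows quickly with graph size and path length, which is consistent with the abstract's remark that enumerating paths between entities is expensive in practice.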

    A formal framework for comparing linked data fragments

    The Linked Data Fragment (LDF) framework has been proposed as a uniform view to explore the trade-offs of consuming Linked Data when servers provide (possibly many) different interfaces to access their data. Every such interface has its own particular properties regarding performance, bandwidth needs, caching, etc. Several practical challenges arise. For example, before exposing a new type of LDFs in some server, can we formally say something about how this new LDF interface compares to other interfaces previously implemented in the same server? From the client side, given a client with some restricted capabilities in terms of time constraints, network connection, or computational power, which is the best type of LDFs to complete a given task? Today there are only a few formal theoretical tools to help answer these and other practical questions, and researchers have embarked on solving them mainly by experimentation. In this paper we propose the Linked Data Fragment Machine (LDFM), which is the first formalization to model LDF scenarios. LDFMs work as classical Turing Machines with extra features that model the server and client capabilities. By proving formal results based on LDFMs, we draw a fairly complete expressiveness lattice that shows the interplay between several combinations of client and server capabilities. We also show the usefulness of our model to formally analyze the fine-grain interplay between several metrics such as the number of requests sent to the server, and the bandwidth of communication between client and server.